105 research outputs found

    Learning to Convolve: A Generalized Weight-Tying Approach

    Get PDF
    Recent work (Cohen & Welling, 2016) has shown that generalizations of convolutions, based on group theory, provide powerful inductive biases for learning. In these generalizations, filters are not only translated but can also be rotated, flipped, etc. However, coming up with exact models of how to rotate a 3 x 3 filter on a square pixel-grid is difficult. In this paper, we learn how to transform filters for use in the group convolution, focussing on roto-translation. For this, we learn a filter basis and all rotated versions of that filter basis. Filters are then encoded by a set of rotation invariant coefficients. To rotate a filter, we switch the basis. We demonstrate we can produce feature maps with low sensitivity to input rotations, while achieving high performance on MNIST and CIFAR-10.Comment: Accepted to ICML 201

    Reversible GANs for Memory-efficient Image-to-Image Translation

    Full text link
    The Pix2pix and CycleGAN losses have vastly improved the qualitative and quantitative visual quality of results in image-to-image translation tasks. We extend this framework by exploring approximately invertible architectures which are well suited to these losses. These architectures are approximately invertible by design and thus partially satisfy cycle-consistency before training even begins. Furthermore, since invertible architectures have constant memory complexity in depth, these models can be built arbitrarily deep. We are able to demonstrate superior quantitative output on the Cityscapes and Maps datasets at near constant memory budget

    Interpretable Transformations with Encoder-Decoder Networks

    Full text link
    Deep feature spaces have the capacity to encode complex transformations of their input data. However, understanding the relative feature-space relationship between two transformed encoded images is difficult. For instance, what is the relative feature space relationship between two rotated images? What is decoded when we interpolate in feature space? Ideally, we want to disentangle confounding factors, such as pose, appearance, and illumination, from object identity. Disentangling these is difficult because they interact in very nonlinear ways. We propose a simple method to construct a deep feature space, with explicitly disentangled representations of several known transformations. A person or algorithm can then manipulate the disentangled representation, for example, to re-render an image with explicit control over parameterized degrees of freedom. The feature space is constructed using a transforming encoder-decoder network with a custom feature transform layer, acting on the hidden representations. We demonstrate the advantages of explicit disentangling on a variety of datasets and transformations, and as an aid for traditional tasks, such as classification.Comment: Accepted at ICCV 201

    Affine Self Convolution

    Get PDF
    Attention mechanisms, and most prominently self-attention, are a powerful building block for processing not only text but also images. These provide a parameter efficient method for aggregating inputs. We focus on self-attention in vision models, and we combine it with convolution, which as far as we know, are the first to do. What emerges is a convolution with data dependent filters. We call this an Affine Self Convolution. While this is applied differently at each spatial location, we show that it is translation equivariant. We also modify the Squeeze and Excitation variant of attention, extending both variants of attention to the roto-translation group. We evaluate these new models on CIFAR10 and CIFAR100 and show an improvement in the number of parameters, while reaching comparable or higher accuracy at test time against self-trained baselines

    A survey of X-ray emission from 100 kpc radio jets

    Full text link
    We have completed a Chandra snapshot survey of 54 radio jets that are extended on arcsec scales. These are associated with flat spectrum radio quasars spanning a redshift range z=0.3 to 2.1. X-ray emission is detected from the jet of approximately 60% of the sample objects. We assume minimum energy and apply conditions consistent with the original Felten-Morrison calculations in order to estimate the Lorentz factors and the apparent Doppler factors. This allows estimates of the enthalpy fluxes, which turn out to be comparable to the radiative luminosities.Comment: Conference Proceedings IAU Symposium No. 313, Extragalactic jets from every angle, pp. 219-224, 4 figure

    MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning

    Get PDF
    This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done. We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong

    Uncertainty modelling in deep learning for safer neuroimage enhancement: Demonstration in diffusion MRI

    Get PDF
    Deep learning (DL) has shown great potential in medical image enhancement problems, such as super-resolution or image synthesis. However, to date, most existing approaches are based on deterministic models, neglecting the presence of different sources of uncertainty in such problems. Here we introduce methods to characterise different components of uncertainty, and demonstrate the ideas using diffusion MRI super-resolution. Specifically, we propose to account for intrinsic uncertainty through a heteroscedastic noise model and for parameter uncertainty through approximate Bayesian inference, and integrate the two to quantify predictive uncertainty over the output image. Moreover, we introduce a method to propagate the predictive uncertainty on a multi-channelled image to derived scalar parameters, and separately quantify the effects of intrinsic and parameter uncertainty therein. The methods are evaluated for super-resolution of two different signal representations of diffusion MR images—Diffusion Tensor images and Mean Apparent Propagator MRI—and their derived quantities such as mean diffusivity and fractional anisotropy, on multiple datasets of both healthy and pathological human brains. Results highlight three key potential benefits of modelling uncertainty for improving the safety of DL-based image enhancement systems. Firstly, modelling uncertainty improves the predictive performance even when test data departs from training data (“out-of-distribution” datasets). Secondly, the predictive uncertainty highly correlates with reconstruction errors, and is therefore capable of detecting predictive “failures”. Results on both healthy subjects and patients with brain glioma or multiple sclerosis demonstrate that such an uncertainty measure enables subject-specific and voxel-wise risk assessment of the super-resolved images that can be accounted for in subsequent analysis. Thirdly, we show that the method for decomposing predictive uncertainty into its independent sources provides high-level “explanations” for the model performance by separately quantifying how much uncertainty arises from the inherent difficulty of the task or the limited training examples. The introduced concepts of uncertainty modelling extend naturally to many other imaging modalities and data enhancement applications
    • …
    corecore